Matching Linked Open Data Entities to Local Thesaurus Concepts
Abstract
We describe a solution for matching Linked Open Data (LOD) entities to concepts within a local thesaurus. The solution is currently integrated into a demonstrator of the PoolParty thesaurus management software. The underlying motivation is to support thesaurus users in linking locally relevant concepts in a thesaurus to descriptions openly available on the Web. Our concept matching algorithm ranks a list of potentially matching LOD entities with respect to a local thesaurus concept, based on their similarity. This similarity is calculated through string matching algorithms based not only on concept and entity labels, but also on the “context” of concepts, i.e. the values of properties of the local concept and the LOD concept. We evaluate over 41 different similarity algorithms on two test ontologies with 17 and 50 concepts, respectively. The results of the first evaluation are validated on the second test dataset of 50 concepts in order to ensure the generalisability of our chosen similarity matches. Finally, the overlap, TFIDF and SoftTFIDF similarity algorithms emerge as the winners of this selection and evaluation procedure.
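As a rough illustration of the ranking idea described in the abstract, the following Python sketch scores candidate LOD entities against a local concept using tokens drawn from both the label and the "context" property values. It is a minimal sketch under stated assumptions, not the PoolParty implementation: the dictionary layout (`label`, `context`, `uri`), the function names, the tokenisation, the smoothed IDF formula, and the `example.org` URIs are all hypothetical. Only an overlap coefficient and a TF-IDF cosine are shown; SoftTFIDF, which additionally allows approximate token matches, is omitted.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bag(item):
    """Collect tokens from the label plus all context property values."""
    return tokens(item["label"]) + [t for v in item["context"] for t in tokens(v)]


def overlap_similarity(a, b):
    """Overlap coefficient over token sets: |A & B| / min(|A|, |B|)."""
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / min(len(sa), len(sb))


def tfidf_similarity(a, b, corpus):
    """Cosine similarity of TF-IDF vectors; IDF is estimated from `corpus`."""
    n = len(corpus)
    df = Counter(t for doc in corpus for t in set(doc))

    def vector(doc):
        # Smoothed IDF (an assumption of this sketch, not taken from the paper).
        return {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0)
                for t, c in Counter(doc).items()}

    va, vb = vector(a), vector(b)
    dot = sum(w * vb.get(t, 0.0) for t, w in va.items())
    norm = (math.sqrt(sum(w * w for w in va.values()))
            * math.sqrt(sum(w * w for w in vb.values())))
    return dot / norm if norm else 0.0


def rank_candidates(concept, candidates, measure="overlap"):
    """Return (score, uri) pairs for the candidates, best match first."""
    concept_bag = bag(concept)
    candidate_bags = [bag(c) for c in candidates]
    corpus = [concept_bag] + candidate_bags
    scored = []
    for cand, cand_bag in zip(candidates, candidate_bags):
        if measure == "overlap":
            score = overlap_similarity(concept_bag, cand_bag)
        else:
            score = tfidf_similarity(concept_bag, cand_bag, corpus)
        scored.append((score, cand["uri"]))
    return sorted(scored, key=lambda pair: -pair[0])


# Hypothetical local concept and LOD candidates, for illustration only.
concept = {"label": "Jaguar", "context": ["large cat", "South America"]}
candidates = [
    {"uri": "http://example.org/jaguar-car", "label": "Jaguar Cars",
     "context": ["British car manufacturer"]},
    {"uri": "http://example.org/jaguar-animal", "label": "Jaguar",
     "context": ["big cat native to South America"]},
]

for score, uri in rank_candidates(concept, candidates, measure="tfidf"):
    print(f"{score:.3f}  {uri}")
```

In this toy case the two candidate labels alone barely separate the animal from the car maker; the context tokens are what push the animal entity to the top of the ranking, which is the effect the abstract attributes to using property values alongside labels.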
Similar resources
Utilizing, Creating and Publishing Linked Open Data with the Thesaurus Management Tool PoolParty
We introduce the Thesaurus Management Tool (TMT) PoolParty based on Semantic Web standards that reduces the effort to create and maintain thesauri by utilizing Linked Open Data (LOD), text-analysis and easy-to-use GUIs. PoolParty’s aim is to lower the access barriers to managing thesauri, so domain experts can contribute to thesaurus creation without needing knowledge about the Semantic Web. A ...
Exploiting multilinguality for ontology matching purposes
The alignment between linguistic artifacts like vocabularies, thesauri, etc., is a task that has attracted considerable attention in recent years [1][2]. With very few exceptions, however, research in this field has primarily focused on the development of monolingual matching algorithms. As more and more artifacts, especially in the Linked Open Data realm, become available in a multilingual fas...
PoolParty: SKOS Thesaurus Management Utilizing Linked Data
Building and maintaining thesauri are complex and laborious tasks. PoolParty is a Thesaurus Management Tool (TMT) for the Semantic Web, which aims to support the creation and maintenance of thesauri by utilizing Linked Open Data (LOD), text-analysis and easy-to-use GUIs, so thesauri can be managed and utilized by domain experts without needing knowledge about the semantic web. Some aspects of t...
Reuse of library thesaurus data as ontologies for the public sector
In the spring of 2013 the National Library of Finland took on a mission to build a national level ontology infrastructure across the public sector. Central to this endeavor is the open source ontology service Finto and the General Finnish Ontology YSO. The ontology is based on the General Finnish Thesaurus, originally developed in the 1980s in the National Library mainly for book indexing. Movi...
Article (refereed) - postprint: Automated Tagging of Environmental Data Using a Novel SKOS Formatted Environmental Thesaurus [in Special Issue: Semantic E-sciences]
There is increasing need to use the widest range of data to address issues of environmental management and change, which is reflected in increasing emphasis from government fundin...